Automatic conceptual indexing of French pharmaceutical theses.

Authors

  • Vincent Mary
  • Bruno Pouliquen
  • Franck Le Duff
  • Stefan J Darmoni
  • Alain Segui
  • Pierre Le Beux
Abstract

French pharmaceutical theses are rarely cited. Although the main obstacles are language and access barriers, inadequate indexing may also be to blame. Manually extracted keywords do not necessarily come from a structured thesaurus. In this work, the manual indexing method is compared with an automated one, "Nomindex", which is based on the UMLS. The automated method is improved by the addition of a relevance scoring system. The first step consists of downloading, adapting and indexing theses in electronic format. The results are then analyzed and sorted by relevance, through the comparison of classic statistical indices (noise, silence and relevance). It was assumed that the manually obtained keywords were always relevant. The silence of manual indexing is nevertheless high: seven new keywords are proposed by Nomindex, whose results are mixed (10% silence, but 50% noise). These results are promising for a first experiment on pharmaceutical documents without lexicon improvement. Although the indexing is currently insufficient for real-life use, it could easily be improved by specific updates of the lexicon.
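
The abstract does not define its indices, so the following is only a minimal sketch under the usual information-retrieval convention: noise is the share of proposed keywords that are not relevant, and silence is the share of relevant keywords that were not proposed. The function name `indexing_indices` and the sample keywords are hypothetical and are not taken from the paper or from Nomindex.

```python
def indexing_indices(proposed, relevant):
    """Return noise and silence for one document.

    Assumed definitions (not stated in the abstract):
      noise   = proposed keywords that are not relevant / all proposed keywords
      silence = relevant keywords that were not proposed / all relevant keywords
    """
    proposed, relevant = set(proposed), set(relevant)
    # Guard against empty sets so the ratios stay defined.
    noise = len(proposed - relevant) / len(proposed) if proposed else 0.0
    silence = len(relevant - proposed) / len(relevant) if relevant else 0.0
    return {"noise": noise, "silence": silence}


# Hypothetical keyword sets for a single thesis.
relevant_keywords = {"aspirin", "toxicity", "pharmacokinetics", "liver"}
automatic_keywords = {"aspirin", "toxicity", "liver", "kidney", "plant", "history"}

result = indexing_indices(automatic_keywords, relevant_keywords)
print(result)
```

In this toy case three of the six proposed keywords are irrelevant and one of the four relevant keywords is missed. Averaging such per-document values over a corpus is one plausible way to obtain aggregate figures like those quoted in the abstract, though the paper's exact procedure is not given here.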


Similar resources

A two-stage split-and-select model for automatic indexing of Persian texts

Purpose: Each language has its own problems, which calls for automatic indexing models suited to each language. These models should address the exhaustivity and specificity of indexing. This paper aims to introduce and evaluate a model suited to Persian automatic indexing. This model suggests to break the text into the particles of candidate terms and to c...


Bibliographic database access using free-text and controlled vocabulary: an evaluation

This paper evaluates and compares the retrieval effectiveness of various search models, based on either automatic text-word indexing or on manually assigned controlled descriptors. Retrieval is from a relatively large collection of bibliographic material written in French. Moreover, for this French collection we evaluate improvements that result from combining automatic and manual indexing. Fir...


Evaluation of a Simple Method for the Automatic Assignment of MeSH Descriptors to Health Resources in a French Online Catalogue

BACKGROUND The growing number of resources to be indexed in the catalogue of online health resources in French (CISMeF) calls for curating strategies involving automatic indexing tools while maintaining the catalogue's high indexing quality standards. OBJECTIVE To develop a simple automatic tool that retrieves MeSH descriptors from document titles. METHODS In parallel to research on advanc...


Improving Automatic Indexing through Concept Combination and Term Enrichment

Although indexes may overlap, the output of an automatic indexer is generally presented as a flat and unstructured list of terms. Our purpose is to exploit term overlap and embedding so as to yield a substantial qualitative and quantitative improvement in automatic indexing through concept combination. The increase in the volume of indexing is 10.5% for free indexing and 52.3% for controlled in...


Using multi-terminology indexing for the assignment of MeSH descriptors to health resources in a French online catalogue

BACKGROUND To assist with the development of a French online quality-controlled health gateway (CISMeF), an automatic indexing tool assigning MeSH descriptors to medical text in French was created. The French Multi-Terminology Indexer (FMTI) relies on a multi-terminology approach involving four prominent medical terminologies and the mappings between them. OBJECTIVE In this paper, we compare le...



Journal:
  • Studies in health technology and informatics

Volume 90

Publication date: 2002